Abstract

This paper proposes a new architecture for memory-based
floating-point numeric function generators (NFGs). The design
method uses piecewise-split edge-valued multi-valued decision
diagrams (EVMDDs). To design NFGs with less memory, we partition
the domain of the floating-point function into segments, and
represent the function using an EVMDD for each segment. By
realizing each EVMDD in hardware, we obtain the floating-point
NFG. This paper also presents an algorithm that partitions the
domain by decomposing the edge-valued binary decision diagram
(EVBDD) representing the whole floating-point function.
Experimental results show that, for single-precision
floating-point functions, our new NFG requires 40% to 65% less
memory than any previous NFG for generic functions.


1.  Introduction 

Numeric functions, such as trigonometric, logarithmic, square
root, and combinations of these functions, are widely used in
computer graphics, digital signal processing, communication
systems, robotics, etc. [11]. In these applications, numeric
functions are usually used as a basic operation, along with
addition and multiplication. For fast computation of numeric
functions, various hardware circuits, called numeric function
generators (NFGs), have been proposed. Fixed-point representation
of numeric functions is often adopted [1, 9, 14, 16, 18, 19, 21,
22]. However, for numeric functions with a wide domain and range,
a fixed-point representation requires a large number of bits.
This produces large NFGs. To represent a large real value with
fewer bits, floating-point representation is often used. An IEEE
standard for real values exists [5]. However, floating-point
representation tends to produce complex and slow NFGs. Thus, the
design of floating-point NFGs is especially hard, and only design
methods for some numeric functions are known [2, 3, 6, 20, 24].
Since these design methods are intended only for specific
functions, different functions need different design methods and
architectures. Thus, an architecture for floating-point NFGs that
can realize a large set of numeric functions is required, along
with a systematic design method.

Large-capacity memories are now available. As with SRAM-based
FPGAs, implementation of logic circuits using memory has become
practical. Thus, we focus on memory-based floating-point NFGs
that can realize various numeric functions by changing data in
memory. A straightforward memory-based NFG for an arbitrary
floating-point function f(x) is a single lookup table (LUT), in
which the address is the binary representation of the value of x
and the content of that address is the corresponding value of
f(x). This NFG is very fast because the value of the function is
obtained by only one table lookup. For high-precision
computations, however, it requires too much memory. To design
monotone floating-point numeric functions with less memory, we
have proposed a memory-based NFG using an edge-valued
multi-valued decision diagram (EVMDD) [15]. Unfortunately, it
still requires a large memory for high-precision computations.

Therefore, this paper proposes a new architecture for
memory-based NFGs that requires less memory. To design an NFG
with less memory, we partition the domain of a floating-point
function into segments, and represent the function using an EVMDD
for each segment. By realizing each EVMDD with memory, we obtain
a memory-based floating-point NFG. This paper also presents an
algorithm that partitions the domain by decomposing the
edge-valued binary decision diagram (EVBDD) representing the
whole floating-point function.

This paper is organized as follows: Section 2 introduces a
floating-point representation of a real-valued numeric function,
and the decision diagrams used in this paper. Section 3 presents
piecewise-split EVMDDs, and an algorithm that partitions the
domain of a floating-point function. Section 4 presents a new
architecture for memory-based floating-point NFGs. Experimental
results are shown in Section 5.

Table 1. Floating-point representation of X with a-bit exponent
and b-bit significand.

Type             Exponent E          Significand D            Value of X
Zero             $(0,0,\ldots,0)_2$  $(0,0,\ldots,0)_2$       $0$
Subnormal no.    $(0,0,\ldots,0)_2$  $\neq (0,0,\ldots,0)_2$  $(-1)^s \times 2^{-E_s} \times 0.D$
Infinity         $(1,1,\ldots,1)_2$  $(0,0,\ldots,0)_2$       $(-1)^s \times \infty$
Not a no. (NaN)  $(1,1,\ldots,1)_2$  $\neq (0,0,\ldots,0)_2$  NaN
Normal no.       Others                                       $(-1)^s \times 2^{E-E_n} \times 1.D$

Bias value for subnormal numbers: $E_s = 2^{a-1} - 2$
Bias value for normal numbers: $E_n = 2^{a-1} - 1$


2.  Preliminaries 

2.1.  Number  Representation  and  Precision 

This  subsection  defines  a  number  representation  and  a 
precision  to  convert  real  functions  into  integer  functions. 

Definition 1 Let $B = \{0,1\}$, $Z$ be the set of the integers,
and $R$ be the set of the real numbers. An n-input m-output
logic function is a mapping $B^n \to B^m$, an integer function
is a mapping $Z \to Z$, and a real function is a mapping
$R \to R$.

Definition 2 The n-bit precision binary floating-point
representation of a number X is a binary n-tuple

$X = (s, e_{a-1}, e_{a-2}, \ldots, e_0, d_{b-1}, d_{b-2}, \ldots, d_0)_2$,

where $s \in B$ is the sign bit, $E = (e_{a-1}, e_{a-2}, \ldots, e_0)_2$
is the exponent, and $D = (d_{b-1}, d_{b-2}, \ldots, d_0)_2$ is the
significand. a and b are the numbers of bits for the exponent
and the significand respectively, and $n = a + b + 1$. The value
of X is shown in Table 1. When $|X| < 2^{2-2^{a-1}}$, X is a
subnormal number, in which the exponent E is biased by
$E_s = 2^{a-1} - 2$, and the significand D represents only
fractional bits of a fixed-point value smaller than 1. When
$2^{2^{a-1}} \le |X|$, X is infinity. When X cannot be defined as
a number, X is represented as not a number (NaN). In other
cases, X is a normal number, in which the exponent E is biased
by $E_n = 2^{a-1} - 1$, and the significand D represents only
fractional bits of a fixed-point value that is larger than or
equal to 1 and smaller than 2.

According to IEEE Standard 754-2008 [5], half (16-bit)
precision has a = 5 and b = 10, single (32-bit) precision has
a = 8 and b = 23, double (64-bit) precision has a = 11 and
b = 52, and quad (128-bit) precision has a = 15 and b = 112.
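As a rough illustration of Definition 2 and Table 1, the
following Python sketch decodes an n-bit code into the value it
represents. The function name `decode` and its argument order are
assumptions made for this illustration only; they do not appear
in the original design method.

```python
import math

def decode(code, a, b):
    """Value of an (a + b + 1)-bit floating-point code, following Table 1."""
    s = code >> (a + b)                 # sign bit
    e = (code >> b) & ((1 << a) - 1)    # a-bit exponent field E
    d = code & ((1 << b) - 1)           # b-bit significand field D
    sign = -1.0 if s else 1.0
    if e == (1 << a) - 1:               # exponent all ones: infinity or NaN
        return sign * math.inf if d == 0 else math.nan
    if e == 0:                          # zero or subnormal: 2^(-E_s) * 0.D
        e_s = (1 << (a - 1)) - 2
        return sign * 2.0 ** (-e_s) * (d / (1 << b))
    e_n = (1 << (a - 1)) - 1            # normal: 2^(E - E_n) * 1.D
    return sign * 2.0 ** (e - e_n) * (1 + d / (1 << b))

# Single precision (a = 8, b = 23): the code 0x3F800000 has E = 127, D = 0.
one = decode(0x3F800000, 8, 23)
```

With a = 3 and b = 4, `decode(1, 3, 4)` evaluates
$2^{-2} \times 1/16 = 0.015625$, the smallest positive subnormal
value of that 8-bit format.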

Note that by using an n-bit precision floating-point
representation, we can convert a real function into an n-input
n-output logic function. The logic function, in turn, can be
converted into an integer function by considering binary vectors
as integers. That is, we can convert a real function into an
integer function $P_n \to P_n$, where
$P_n = \{0, 1, \ldots, 2^n - 1\}$. In this paper, numeric
functions are converted into integer functions by using a
floating-point representation, unless stated otherwise. And, for
simplicity, each bit in the floating-point representation of X
is denoted by $x_i$, where $x_0$ is the least significant bit.

Table 2. Tables for the 8-bit precision (3-bit exponent, 4-bit
significand) floating-point $\sqrt{X}$.

(a) Table for $\sqrt{X}$    (b) Truth table for $f_b(X)$    (c) Table for $f(X)$

X          $\sqrt{X}$       X             $f_b(X)$          X    $f(X)$
0.000000   0.000000         0 000 0000    0 000 0000        0    0
0.015625   0.125000         0 000 0001    0 000 1000        1    8
0.031250   0.171875         0 000 0010    0 000 1011        2    11
0.046875   0.218750         0 000 0011    0 000 1110        3    14
0.062500   0.250000         0 000 0100    0 001 0000        4    16
0.078125   0.281250         0 000 0101    0 001 0010        5    18
0.093750   0.312500         0 000 0110    0 001 0100        6    20
0.109375   0.328125         0 000 0111    0 001 0101        7    21

Example 1 Table 2 (a) is the function table for $\sqrt{X}$. The
8-bit precision (3-bit exponent and 4-bit significand)
floating-point representation of this function is the logic
function $f_b(X)$ in Table 2 (b). By converting binary vectors
into integers, we have the integer function $f(X)$ of $f_b(X)$
in Table 2 (c). That is, our 8-bit precision floating-point
representation of $f(X) = \sqrt{X}$ corresponds to the integer
function of Table 2 (c). (End of Example)
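The conversion in Example 1 can be sketched in Python as follows.
This is only an illustration under stated assumptions: the helper
names `value` and `encode` are hypothetical, only non-negative
codes are handled, and rounding to the nearest representable
value is assumed (the text does not state the rounding rule,
although this choice reproduces Table 2 (c)).

```python
import math

A, B = 3, 4   # exponent and significand widths used in Example 1

def value(code):
    """Value of a non-negative finite code, following Table 1."""
    e, d = code >> B, code & (2**B - 1)
    if e == 0:                                     # zero or subnormal
        return 2.0 ** (-(2**(A - 1) - 2)) * d / 2**B
    return 2.0 ** (e - (2**(A - 1) - 1)) * (1 + d / 2**B)

# All non-negative codes whose exponent field is not all ones.
FINITE = range((2**A - 1) << B)

def encode(r):
    """Code whose value is nearest to r (assumed rounding rule)."""
    return min(FINITE, key=lambda c: abs(value(c) - r))

# Integer function f(X) for the first eight inputs, as in Table 2 (c).
f = [encode(math.sqrt(value(x))) for x in range(8)]
# f == [0, 8, 11, 14, 16, 18, 20, 21]
```

The exhaustive search in `encode` is practical only for such a
small format; it is used here to keep the sketch short.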

2.2. Edge-Valued MDDs

This  subsection  defines  the  decision  diagrams  used  in 
this  paper. 

Definition 3 An edge-valued binary decision diagram (EVBDD)
[8, 17, 23] is a variant of the BDD [4, 10] that represents an
integer function. The EVBDD is obtained by repeatedly applying
the expansion $f = \bar{x}_i f_0 + x_i (f_1' + \alpha)$ to the
integer function, where $f_1 = f_1' + \alpha$, and $\alpha$ is
the constant term of $f_1$. The EVBDD consists of only one
terminal node representing 0 and non-terminal nodes with 1-edges
having integer weights $\alpha$. In an EVBDD, 0-edges always
have zero weights. The incoming edge into the root node can have
a non-zero weight. The output (integer) value of an EVBDD is the
sum of the weights associated with the path taken from the root
node to the terminal node.
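To make Definition 3 concrete, the following Python sketch builds
a reduced EVBDD from a function table and evaluates it by summing
edge weights. It is a minimal illustration, not the
implementation used in the paper: the names `build_evbdd` and
`evaluate`, the dictionary-based node store, and the fixed
variable order (most significant bit first) are all assumptions.

```python
def build_evbdd(table):
    """Return (root_weight, root_node, info) for a table of length 2^k."""
    nodes = {}          # (level, lo, alpha, hi) -> node id
    info = {0: None}    # node id -> (level, lo, alpha, hi); id 0 is the terminal

    def make(tab):      # tab[0] == 0 always holds here
        if len(tab) == 1:
            return 0
        half = len(tab) // 2
        lo = make(tab[:half])                            # 0-edge, weight 0
        alpha = tab[half]                                # weight of the 1-edge
        hi = make(tuple(v - alpha for v in tab[half:]))
        if lo == hi and alpha == 0:
            return lo                                    # variable has no effect
        key = (half.bit_length() - 1, lo, alpha, hi)     # level = tested bit index
        if key not in nodes:                             # share identical nodes
            nodes[key] = len(info)
            info[nodes[key]] = key
        return nodes[key]

    c = table[0]                                         # weight of the root edge
    return c, make(tuple(v - c for v in table)), info

def evaluate(evbdd, x):
    """Sum the weights on the path selected by the bits of x."""
    total, node, info = evbdd
    while node != 0:
        level, lo, alpha, hi = info[node]
        if (x >> level) & 1:
            total, node = total + alpha, hi
        else:
            node = lo
    return total

table = [0, 8, 11, 14, 16, 18, 20, 21]   # integer function of Table 2 (c)
evbdd = build_evbdd(table)
values = [evaluate(evbdd, x) for x in range(len(table))]
```

Here `values` reproduces `table`, since every path sum equals the
corresponding function value.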

Definition 4 For a set of n binary variables $\{X\}$, if
$\{X\} = \{X_u\} \cup \{X_{u-1}\} \cup \ldots \cup \{X_1\}$,
$\{X_i\} \neq \emptyset$, and $\{X_i\} \cap \{X_j\} = \emptyset$
$(i \neq j)$, then $(X_u, X_{u-1}, \ldots, X_1)$ is a partition
of X. Each $X_i$ forms a super variable. Let $|X_i| = k_i$ and
$k_u + k_{u-1} + \ldots + k_1 = n$. Then, by considering each
super variable as a multi-valued variable, an integer function
$f(X) : Z \to Z$ can be converted into a multi-valued input
integer function
$f(X_u, X_{u-1}, \ldots, X_1) : P_u \times P_{u-1} \times \ldots \times P_1 \to Z$,
where $P_i = \{0, 1, 2, \ldots, 2^{k_i} - 1\}$.

Definition 5 An edge-valued multi-valued decision diagram
(EVMDD) [13] is an extension of the MDD [7, 12, 25], and
represents a multi-valued input integer function. It consists of
one terminal node representing 0 and non-terminal nodes. Edges
have integer weights. Edges labeled by a logic 0 have integer 0
weight.

Figure 1. EVBDD and EVMDD for an integer function.

Example 2 Fig. 1 (a) and (b) show the EVBDD and the EVMDD,
respectively, for the same integer function. In Fig. 1 (a),
dashed lines and solid lines denote 0-edges and weighted
1-edges, respectively. In the EVMDD, the set of binary variables
$\{X\}$ is partitioned into $\{X_2\} = \{x_3, x_2, x_1\}$ and
$\{X_1\} = \{x_0\}$. To obtain the function value 3 for
$X = (1,0,1,0)_2$, we traverse the EVBDD or the EVMDD from the
root node to the terminal node according to the input values,
and obtain the function value as the sum of the weights for the
traversed edges. Note that we traverse the EVMDD using
$X_2 = 5$ and $X_1 = 0$. (End of Example)

3. Representation Using Piecewise-Split EVMDDs

This section introduces a piecewise-split EVMDD, and presents an
algorithm that partitions the domain of a floating-point
function.

Figure 2. Architecture for NFG based on a monolithic EVMDD.

Figure 3. Example of a piecewise-split EVMDD.

3.1.  Piecewise-Split  EVMDDs 

Since floating-point numeric functions can be converted into
integer functions, they can be compactly represented using
EVMDDs. Fig. 2 shows the architecture of an NFG based on an
EVMDD, which is the result of a decomposition of the EVMDD [15].
However, high-precision floating-point functions (those with a
large number of bits) necessarily require large EVMDDs,
resulting in NFGs with large memory size. Such NFGs are also
slow. To reduce memory size, we represent a floating-point
numeric function using a set of smaller EVMDDs, instead of using
a monolithic EVMDD. Then, we design a floating-point NFG using
the smaller EVMDDs.

In this paper, we produce a set of EVMDDs by partitioning the
domain of a floating-point function into segments, and
representing the function using an EVMDD for each segment, as
shown in Fig. 3. Hence, we call the set of EVMDDs a
piecewise-split EVMDD. Note that, in a piecewise-split EVMDD, we
need an EVMDD which selects a segment from input X, in addition
to the EVMDD for each segment.

Figure 4. Segmentation algorithm using EVBDD.

3.2. Segmentation Algorithm Using EVBDD


In a piecewise-split EVMDD, the number of EVMDDs and the size of
each EVMDD depend on how the domain of the function is
segmented. Thus, an effective segmentation algorithm which makes
the sizes of all the EVMDDs the same is desired to reduce the
memory size of the NFG. In this paper, we propose a segmentation
algorithm that partitions the domain into segments by
decomposing the EVBDD representing the whole floating-point
function. By using the EVBDD, we can simultaneously produce a
segmentation of the domain and the piecewise-split EVMDD.

Figure 5. Decomposition of EVBDD and piecewise-split EVMDD.
(b) Piecewise-split EVMDD

Fig. 4 shows the proposed segmentation algorithm. This algorithm
decomposes a given EVBDD so that the sizes of all sub-EVBDDs are
smaller than or equal to a given threshold value, as shown in
Fig. 5(a). Then, it produces a piecewise-split EVMDD as shown in
Fig. 5(b). In Fig. 5(a), sub-EVBDDs with lower heights (i.e.,
those in which fewer input variables are used) correspond to
narrower segments. In such segments, the floating-point function
values change rapidly, and so the sizes of the EVBDDs are large
[15]. In Fig. 5(b), the multi-terminal EVMDD is used to select a
segment, and the other EVMDDs are used to represent the
floating-point function in each segment.
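The threshold-driven decomposition described above can be
sketched in Python as follows. The algorithm of Fig. 4 is not
reproduced in this text, so this is only one plausible reading:
it assumes that the domain is halved on the most significant
remaining bit until the EVBDD of each segment has at most
`threshold` non-terminal nodes. The names `evbdd_size` and
`segment` are hypothetical.

```python
def evbdd_size(table):
    """Number of non-terminal nodes in the reduced EVBDD of a table of length 2^k."""
    nodes = {}

    def make(tab):
        if len(tab) == 1:
            return 0                            # terminal node
        half = len(tab) // 2
        lo, hi = make(tab[:half]), make(tab[half:])
        alpha = tab[half] - tab[0]              # weight of the 1-edge
        if lo == hi and alpha == 0:
            return lo                           # redundant node is skipped
        # Identical (size, children, weight) tuples share one node.
        return nodes.setdefault((half, lo, alpha, hi), len(nodes) + 1)

    make(tuple(table))
    return len(nodes)

def segment(table, threshold, lo=0):
    """Split the domain until every sub-EVBDD has at most `threshold` nodes."""
    if len(table) == 1 or evbdd_size(table) <= threshold:
        return [(lo, lo + len(table) - 1)]      # inclusive segment [lo, hi]
    half = len(table) // 2
    return (segment(table[:half], threshold, lo)
            + segment(table[half:], threshold, lo + half))

t = (0, 8, 11, 14, 16, 18, 20, 21)              # integer function of Table 2 (c)
segments = segment(t, 2)
```

Because each split fixes one more of the most significant bits,
segments found deeper in the recursion are narrower and use fewer
input variables, which matches the relation between height and
segment width noted for Fig. 5(a).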

4.  NFGs  Based  on  Piecewise-Split  EVMDDs 

By realizing each EVMDD produced by the proposed segmentation
algorithm using the architecture shown in Fig. 2, we obtain the
NFG in Fig. 6. A value of the numeric function in each segment
is computed using the least significant bits of X in parallel,
and then an appropriate value is selected by the segment
selector, which realizes a multi-terminal EVMDD. Since the
segment selector processes the most significant bits of X in
parallel with the computation of the function values, the NFG
based on a piecewise-split EVMDD is faster than the one based on
a monolithic EVMDD.
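The data flow of this NFG can be mimicked in software as
follows. This is a behavioral sketch only: plain Python tuples
stand in for the per-segment EVMDD memories, the segment selector
is a simple range test rather than a multi-terminal EVMDD, and
segments are assumed to be aligned blocks whose widths are powers
of two. The name `make_nfg` is hypothetical.

```python
def make_nfg(table, segments):
    """Build a behavioral model of the NFG from inclusive (lo, hi) segments."""
    units = [(lo, hi, tuple(table[lo:hi + 1])) for lo, hi in segments]

    def nfg(x):
        # Every unit is addressed by the least significant bits of x;
        # in hardware these lookups would proceed in parallel.
        candidates = [mem[x & (hi - lo)] for lo, hi, mem in units]
        # The selector uses the position of x in the domain to pick one result.
        select = next(i for i, (lo, hi, _) in enumerate(units) if lo <= x <= hi)
        return candidates[select]

    return nfg

table = [0, 8, 11, 14, 16, 18, 20, 21]            # integer function of Table 2 (c)
nfg = make_nfg(table, [(0, 3), (4, 5), (6, 6), (7, 7)])
outputs = [nfg(x) for x in range(len(table))]
```

For an aligned segment of width w, the mask `hi - lo` equals
w - 1, so `x & (hi - lo)` extracts exactly the low bits that
address that segment's memory.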

In  addition,  the  proposed  NFG  has  the  following  advan¬ 
tages: 


1. Since the proposed NFG just traverses a set of EVMDDs, and
computes the sum of edge weights (integers) in parallel, it
requires only integer adders to compute function values of a
floating-point function. That is, it requires neither the
rounding circuit nor the normalization circuit (which are
complex).

2. Since the NFG is a memory-based architecture, a wide range of
numeric functions can be realized by changing only the data in
the LUT memories.

3. Since the NFG directly realizes the function table of a
floating-point function using a piecewise-split EVMDD, it is
more accurate than existing NFGs using polynomial approximation
[2, 3, 6, 20, 24].

4. The NFG is suitable for pipeline processing, and thus it can
achieve a high throughput.

5. Experimental Results

To show the effectiveness of piecewise-split EVMDDs, we realize
single-precision floating-point numeric functions using two
types of NFGs: an NFG based on a monolithic EVMDD and an NFG
based on a piecewise-split EVMDD. We compare the memory size
needed for the two types of NFGs. Table 3 shows the memory size
needed for the NFGs, in megabits, and the number of segments for
piecewise-split EVMDDs.

Table 3. Memory size needed for NFGs based on a monolithic EVMDD
and a piecewise-split EVMDD for single-precision (32-bit)
floating-point numeric functions.

                    Memory size (Mbits)
Functions f(X)      Monolithic EVMDD   Piecewise EVMDD   Number of segments   Ratio (%)
$\sin^{-1}(X)$      34                 12                28                   35
$\ln(X)$            564                337               25                   60
$1/X$               31                 15                31                   51
$\sqrt{X}$          34                 13                23                   39
$\sqrt{-\ln(X)}$    271                160               32                   59

Ratio = (Piecewise EVMDD) / (Monolithic EVMDD) × 100 (%).

Since a single LUT that realizes a whole single-precision
floating-point function requires $2^{32} \times 32$ bits = 128
Gbits, the memory size needed for both types of NFGs is three or
four orders of magnitude less than that of the single LUT-based
NFG. And, by using piecewise-split EVMDDs, we can achieve a
further reduction to 35% to 60% of the memory size needed for
the NFGs based on the monolithic EVMDDs. Table 3 shows that we
can reduce memory size significantly with a small number of
segments. Thus, the size of the multiplexer used in the NFG is
also small. Further, we can generate such compact NFGs
automatically.

6.  Conclusion  and  Comments 

This paper proposes a new architecture for memory-based
floating-point NFGs, and a design method using piecewise-split
EVMDDs. We also present an algorithm to produce efficient
piecewise-split EVMDDs by decomposing the EVBDD representing a
given floating-point function. Experimental results show that,
for single-precision floating-point functions, our new NFGs
based on piecewise-split EVMDDs require 40% to 65% less memory
than ones based on monolithic EVMDDs. By using piecewise-split
EVMDDs, we can automatically generate compact NFGs. Since our
memory-based NFG is quite general, it can realize not only
floating-point functions, but also discrete functions and even
two-variable functions.

Future work includes 1) reducing memory size further, and 2)
developing an optimization algorithm for decomposing an EVBDD.
Our NFG still requires a large memory size for FPGA
implementation. We will further reduce memory size so that our
single-precision floating-point NFG can be implemented with an
FPGA. The proposed algorithm decomposes an EVBDD using a
threshold value for the number of nodes. However, an algorithm
that can find an optimum decomposition of the EVBDD in terms of
the memory size or delay time of the NFG would be more
practical.